Abstract: The purpose of the present paper is to establish moderate deviation principles
for a rather general class of random variables fulfilling certain bounds of the cumulants. We
apply a celebrated lemma of the theory of large deviations probabilities due to Rudzkis, Saulis
and Statulevičius. The examples of random objects we treat include dependency graphs,
subgraph-counting statistics in Erdős-Rényi random graphs and $U$-statistics. Moreover,
we prove moderate deviation principles for certain statistics appearing in random matrix
theory, namely characteristic polynomials of random unitary matrices as well as the number
of particles in a growing box of random determinantal point processes like the number of
eigenvalues in the GUE or the number of points in Airy, Bessel, and sine random point fields.

1. Introduction 

Since the late seventies, estimates of cumulants have been studied not only to show
convergence in law, but also to obtain a more precise asymptotic analysis
of the distribution via rates of convergence and large deviation probabilities, see e.g.
[30] and references therein. In [13] it has been shown how to use such bounds to prove
a moderate deviation principle for a class of counting functionals in models of geometric
probability. This paper provides a general approach to show moderate deviation principles
via cumulants.

Let $X$ be a real-valued random variable with existing absolute moments. Then

$$\Gamma_j := \Gamma_j(X) := (-i)^j \frac{d^j}{dt^j} \log \mathbb{E}\big[e^{itX}\big] \Big|_{t=0}$$

exists for all $j \in \mathbb{N}$ and the term is called the $j$th cumulant (also called semi-invariant) of $X$.
Here and in the following $\mathbb{E}$ denotes the expectation of the corresponding random variable.
The method of moments results in a method of cumulants, saying that if the distribution of
$X$ is determined by its moments and $(X_n)_n$ are random variables with finite moments such
that $\Gamma_j(X_n) \to \Gamma_j(X)$ as $n \to \infty$ for every $j \ge 1$, then $(X_n)_n$ converges in distribution to
$X$. Hence if the first cumulant of $X_n$ converges to zero, the second cumulant converges to one and
all cumulants of $X_n$ of order bigger than 2 converge to zero, then the sequence $(X_n)_n$ satisfies a Central
Limit Theorem (CLT). Knowing additionally exact bounds of the cumulants one is able to
describe the asymptotic behaviour more precisely. Let $Z_n$ be a real-valued random variable
with mean $\mathbb{E} Z_n = 0$ and variance $\mathbb{V} Z_n = 1$ and

$$|\Gamma_j(Z_n)| \le \frac{(j!)^{1+\gamma}}{\Delta^{j-2}} \tag{1.1}$$

for all $j = 3, 4, \dots$ and all $n \ge 1$ for fixed $\gamma \ge 0$ and $\Delta > 0$. Here and in the following $\mathbb{V}$
denotes the variance of the corresponding random variable. Denoting the standard normal
distribution function by

$$\Phi(x) := \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-y^2/2}\, dy,$$

one obtains the following bound for the Kolmogorov distance

$$\sup_{x \in \mathbb{R}} |P(Z_n \le x) - \Phi(x)| \le c_\gamma\, \Delta^{-1/(1+2\gamma)},$$

where $c_\gamma$ is a constant depending only on $\gamma$, see [30, Lemma 2.1]. By this result, the
distribution function $F_n$ of $Z_n$ converges uniformly to $\Phi$ as $n \to \infty$. Hence, when $x = O(1)$
we have

$$\lim_{n \to \infty} \frac{1 - F_n(x)}{1 - \Phi(x)} = 1. \tag{1.2}$$

One is interested to have - under additional conditions - such a relation in the case when $x$
depends on $n$ and tends to $\infty$ as $n \to \infty$. In particular, one is interested in conditions under
which the relation (1.2) holds in the interval $0 \le x \le f(n)$, where $f(n)$ is a non-decreasing
function such that $f(n) \to \infty$. If the relation holds in such an interval, we call the interval a
zone of normal convergence. In the case of partial sums of i.i.d. random variables with zero
mean and finite positive variance, it can be shown applying Mills' ratio that $f(n)$ can be
chosen as $(1 - \varepsilon)(\log n)^{1/2}$ for any $0 < \varepsilon < 1$, if the third absolute moment of $X_1$ is assumed
to be finite (see [28, Lemma 5.8]). Moreover, (1.2) cannot be true in general since for the
symmetric binomial distribution the numerator vanishes for all $x > \sqrt{n}$. For i.i.d. partial
sums the classical result due to Cramér is that if $\mathbb{E} e^{t|X_1|^{1/2}} < \infty$ for some $t > 0$, (1.2) holds
with $f(n) = o(n^{1/6})$. In [30, Chapter 2], relations of large deviations of the type (1.2) are
proved under the condition (1.1) on cumulants with a zone of normal convergence of size
proportional to $\Delta^{1/(1+2\gamma)}$, see Lemma 2.3 in [30].

The aim of this paper is to show that under the same type of condition on cumulants of
random variables $Z_n$ moderate deviation principles can be deduced. Actually we will take the
detour via large deviation probabilities, showing that under condition (1.1) the deducible
results on large deviation probabilities imply a moderate deviation principle. For partial
sums $S_n$ of i.i.d. random variables $(X_i)_i$ one can find in [26] the remark that large deviation
probability results imply asymptotic expansions for the tail probabilities $P(S_n \ge n\mathbb{E}(X_1) + n^{1/2}x)$
and $P(S_n \le n\mathbb{E}(X_1) - n^{1/2}x)$ for $x \ge 0$ and $x = o(n^{1/2})$ and moreover, that these expansions
imply a moderate deviation principle. We have not found the general statement proven in
the literature, that large deviation probability results imply in general a moderate deviation
principle. Our abstract result, Theorem 1.1, is motivated by various applications. We will
prove moderate deviation principles for a couple of statistics applying Theorem 1.1. Some
results will be improvements of existing results, most of our examples are new moderate
deviation results.

Let us recall the definition of a large deviation principle (LDP) due to Varadhan, see for
example [10]. A sequence of probability measures $\{(\mu_n), n \in \mathbb{N}\}$ on a topological space $\mathcal{X}$
equipped with a $\sigma$-field $\mathcal{B}$ is said to satisfy the LDP with speed $s_n \nearrow \infty$ and good rate
function $I(\cdot)$ if the level sets $\{x : I(x) \le \alpha\}$ are compact for all $\alpha \in [0, \infty)$ and for all $\Gamma \in \mathcal{B}$
the lower bound

$$\liminf_{n \to \infty} \frac{1}{s_n} \log \mu_n(\Gamma) \ge - \inf_{x \in \operatorname{int}(\Gamma)} I(x)$$

and the upper bound

$$\limsup_{n \to \infty} \frac{1}{s_n} \log \mu_n(\Gamma) \le - \inf_{x \in \operatorname{cl}(\Gamma)} I(x)$$

hold. Here $\operatorname{int}(\Gamma)$ and $\operatorname{cl}(\Gamma)$ denote the interior and closure of $\Gamma$ respectively. We say a
sequence of random variables satisfies the LDP when the sequence of measures induced by 
these variables satisfies the LDP. Formally a moderate deviation principle is nothing else 
but the LDP. However, we will speak about a moderate deviation principle (MDP) for a 
sequence of random variables, whenever the scaling of the corresponding random variables 
is between that of an ordinary Law of Large Numbers and that of a Central Limit Theorem. 

The following main theorem of this paper generalizes the idea in [13] to use the method
of cumulants to investigate moderate deviation principles:

Theorem 1.1. For any $n \in \mathbb{N}$, let $Z_n$ be a centered random variable with variance one and
existing absolute moments, which satisfies

$$|\Gamma_j(Z_n)| \le \frac{(j!)^{1+\gamma}}{\Delta_n^{j-2}} \quad \text{for all } j = 3, 4, \dots \tag{1.3}$$

for fixed $\gamma \ge 0$ and $\Delta_n > 0$. Let the sequence $(a_n)_{n \ge 1}$ of real numbers grow to infinity, but
slowly enough such that

$$\frac{a_n}{\Delta_n^{1/(1+2\gamma)}} \xrightarrow{n \to \infty} 0$$

holds. Then the moderate deviation principle for $\big(\frac{1}{a_n} Z_n\big)_n$ with speed $a_n^2$ and rate function
$I(x) = \frac{x^2}{2}$ holds true.
The Theorem opens up the possibility to prove moderate deviations for a wide range of
dependent random variables. Before we proceed, we consider a moderate deviation
principle for partial sums of independent, non-identically distributed random variables.
Interestingly enough, we have not found any reference for the following result.

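Theorem 1.2. Let $(X_i)_{i \ge 1}$ be a sequence of independent real-valued random variables with
expectation zero and variances $\sigma_i^2 > 0$, $i \ge 1$, and let us assume that $\gamma \ge 0$ and $K > 0$ exist
such that for all $i \ge 1$

$$|\mathbb{E} X_i^j| \le (j!)^{1+\gamma} K^{j-2} \sigma_i^2 \quad \text{for all } j = 3, 4, \dots \tag{1.4}$$

Let $Z_n := \frac{1}{\sqrt{\sum_{i=1}^n \sigma_i^2}} \sum_{i=1}^n X_i$. Then $\big(\frac{1}{a_n} Z_n\big)_{n \ge 1}$ satisfies the moderate deviation principle with
speed $a_n^2$ and rate function $\frac{x^2}{2}$ for any

$$1 \ll a_n \ll \left( \frac{\sqrt{\sum_{i=1}^n \sigma_i^2}}{2 \max\{K; \max_{1 \le i \le n} \sigma_i\}} \right)^{1/(1+2\gamma)}.$$

Remark that condition (1.4) is a generalization of the classical Bernstein condition ($\gamma = 0$).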

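Proof. Using a relation between moments and cumulants, condition (1.4) implies that the
$j$-th cumulant of $X_i$ can be bounded by $(j!)^{1+\gamma} (2\max\{K, \sigma_i\})^{j-2} \sigma_i^2$. Hence it follows from
the independence of the random variables $X_i$, $i \ge 1$, that the $j$-th cumulant of $Z_n$ has the
bound

$$|\Gamma_j(Z_n)| \le (j!)^{1+\gamma} \left( \frac{2 \max\{K; \max_{1 \le i \le n} \sigma_i\}}{\sqrt{\sum_{i=1}^n \sigma_i^2}} \right)^{j-2}, \tag{1.5}$$

for details see for example [30, Theorem 3.1]. Thus for $Z_n$ the condition of Theorem 1.1
holds with

$$\Delta_n = \frac{\sqrt{\sum_{i=1}^n \sigma_i^2}}{2 \max\{K; \max_{1 \le i \le n} \sigma_i\}}.$$

The result follows from Theorem 1.1. □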
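As a small numerical illustration of the cumulant condition, the following sketch computes cumulants from moments with the standard moment-cumulant recursion and checks the bound (1.3) for normalized sums of i.i.d. Rademacher signs. The choice of Rademacher variables, the values $K = \sigma_i = 1$, $\gamma = 0$ (so that $\Delta_n = \sqrt{n}/2$ in Theorem 1.2) and all function names are assumptions made for this example only; they do not appear in the paper.

```python
from fractions import Fraction
from math import comb, factorial, sqrt


def cumulants_from_moments(moments):
    """Given moments[k-1] = E[X^k] for k = 1..J, return [kappa_1, ..., kappa_J]."""
    m = [Fraction(1)] + [Fraction(x) for x in moments]  # m[0] = E[X^0] = 1
    kappa = [Fraction(0)] * len(m)                       # kappa[0] is unused
    for j in range(1, len(m)):
        # m_j = sum_{k=1}^{j} C(j-1, k-1) * kappa_k * m_{j-k}, solved for kappa_j
        kappa[j] = m[j] - sum(
            comb(j - 1, k - 1) * kappa[k] * m[j - k] for k in range(1, j)
        )
    return kappa[1:]


def rademacher_sum_satisfies_bound(n, max_order=10):
    """Check |Gamma_j(Z_n)| <= j! / Delta_n^(j-2) for Z_n = (X_1+...+X_n)/sqrt(n)."""
    # Rademacher moments: E[X^k] = 1 for even k and 0 for odd k.
    moments = [1 if k % 2 == 0 else 0 for k in range(1, max_order + 1)]
    kappa = cumulants_from_moments(moments)
    delta_n = sqrt(n) / 2  # assumed: Delta_n of Theorem 1.2 with K = sigma_i = 1
    for j in range(3, max_order + 1):
        # Cumulants are additive over independent summands and homogeneous of degree j.
        gamma_j = float(kappa[j - 1]) * n / n ** (j / 2)
        if abs(gamma_j) > factorial(j) / delta_n ** (j - 2):
            return False
    return True
```

Exact rational arithmetic is used for the recursion so that the cumulants themselves carry no rounding error; only the final comparison with $\Delta_n$ uses floating point.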

Remark 1.3. If Cramér's condition holds, that is there exists $\lambda > 0$ such that $\mathbb{E} e^{\lambda |X_i|} < \infty$
holds for all $i \in \mathbb{N}$, then $X_i$ satisfies Bernstein's condition, which is the bound (1.4) with
$\gamma = 0$, see for example [33, Remark 3.6.1]. This implies (1.5) and we can apply Theorem
1.1 as above. Therefore Theorem 1.1 requires fewer restrictions on the random sequence than
Cramér's condition.

The paper is organized as follows. Section 2 is devoted to applications for so-called
dependency graphs including counting statistics of subgraphs in Erdős-Rényi random graphs.
Section 3 presents applications to $U$-statistics. Theorem 1.1 and Theorem 1.2 will be applied
in random matrix theory in Section 4. We will be able to reprove moderate deviations for the
characteristic polynomials for the COE, CUE and CSE matrix ensembles. Moreover we will
prove moderate deviations for determinantal point processes with applications in random
matrix theory. Finally, in Section 5 we present the proof of Theorem 1.1.
2. Applications to dependency graphs

Let $\{X_\alpha\}_{\alpha \in \mathcal{I}}$ be a family of random variables defined on a common probability space. A
dependency graph for $\{X_\alpha\}_{\alpha \in \mathcal{I}}$ is any graph $L$ with vertex set $\mathcal{I}$ which satisfies the following
condition: For any two disjoint subsets of vertices $V_1$ and $V_2$ such that there is no edge
from any vertex in $V_1$ to any vertex in $V_2$, the corresponding collections of random variables
$\{X_\alpha\}_{\alpha \in V_1}$ and $\{X_\alpha\}_{\alpha \in V_2}$ are independent.

Let the maximal degree of a dependency graph $L$ be the maximum of the number of edges
incident to one vertex of $L$. The idea behind the usefulness of dependency graphs is that
if the maximal degree is not too large, one expects a Central Limit Theorem for the partial
sums of the family $\{X_\alpha\}_{\alpha \in \mathcal{I}}$. We will consider moderate deviations. Note that there does
not exist a unique dependency graph, for example the complete graph works for any set of 
random variables. 

Example 2.1. A standard situation is that there is an underlying family of independent
random variables $\{Y_i\}_{i \in \mathcal{A}}$, and each $X_\alpha$ is a function of the variables $\{Y_i\}_{i \in A_\alpha}$, for some
$A_\alpha \subset \mathcal{A}$. With $S = \{A_\alpha : \alpha \in \mathcal{I}\}$ the graph $L = L(S)$ with vertex set $\mathcal{I}$ and edge set
$\{\alpha\beta : A_\alpha \cap A_\beta \ne \emptyset\}$ is a dependency graph for the family $\{X_\alpha\}_{\alpha \in \mathcal{I}}$. As a special case of this
example, we will consider subgraphs of an Erdős-Rényi random graph.

Another context, outside the scope of the present paper, in which dependency graphs are
used is the Lovász Local Lemma, see [3]. Central limit theorems for $Z := \sum_{\alpha \in \mathcal{I}} X_\alpha$ are
obtained in [5], see [HI Theorem 9.6] for corresponding Berry-Esseen bounds. We obtain the 
following bounds on cumulants of Z: 

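Theorem 2.2. Suppose that $L$ is a dependency graph for the family $\{X_\alpha\}_{\alpha \in \mathcal{I}}$ and that $M$
is the maximal degree of $L$. Suppose further that $|X_\alpha| \le A$ almost surely for any $\alpha \in \mathcal{I}$ and
some constant $A$. Let $\sigma^2$ be the variance of $Z := \sum_{\alpha \in \mathcal{I}} X_\alpha$. Then the cumulants $\Gamma_j$ of $\frac{Z}{\sigma}$ are
bounded by

$$|\Gamma_j| \le (j!)^3\, |\mathcal{I}|\, (M+1)^{j-1}\, \frac{(2eA)^j}{\sigma^j} \tag{2.6}$$

for all $j \ge 1$.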

Proof. For notational reasons we consider without loss of generality the case where the index
set $\mathcal{I}$ is chosen to be $\mathcal{I} = \{1, \dots, N\}$ for any fixed natural number $N \in \mathbb{N}$. In [19, Lemma 4]
bounds for the cumulants were given. Our main task is to obtain a bound which captures the
dependence on $j$ (and $j!$) as exactly as possible. The first steps of our proof can be
found in [19, Lemma 4]. Assuming the existence of the $j$-th moments of $X_1, \dots, X_j$ define
the multi-linear function

$$\kappa(X_1, \dots, X_j) := (-i)^j \frac{\partial^j}{\partial t_1 \cdots \partial t_j} \log \mathbb{E}\big[\exp(i t_1 X_1) \cdots \exp(i t_j X_j)\big] \Big|_{(t_1, \dots, t_j) = (0, \dots, 0)}.$$
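By definition, for any random variable $X$ the cumulant is given by $\Gamma_j(X) = \kappa(\underbrace{X, \dots, X}_{j \text{ times}})$
and for the cumulant of $Z = \sum_{i=1}^N X_i$ we have

$$\Gamma_j\Big(\sum_{i=1}^N X_i\Big) = \kappa\Big(\underbrace{\sum_{i=1}^N X_i, \dots, \sum_{i=1}^N X_i}_{j \text{ times}}\Big) = \sum_{i_1=1}^N \cdots \sum_{i_j=1}^N \kappa(X_{i_1}, \dots, X_{i_j}). \tag{2.7}$$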

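Suppose that $X_1, \dots, X_m$ are independent of $X_{m+1}, \dots, X_j$ for some $1 \le m < j$, then

$$\begin{aligned}
\kappa(X_1, \dots, X_j) &= (-i)^j \frac{\partial^j}{\partial t_1 \cdots \partial t_j} \log \mathbb{E}\big[\exp(i t_1 X_1) \cdots \exp(i t_j X_j)\big] \Big|_{(0, \dots, 0)} \\
&= (-i)^j \frac{\partial^j}{\partial t_1 \cdots \partial t_j} \log \mathbb{E}\big[\exp(i t_1 X_1) \cdots \exp(i t_m X_m)\big] \Big|_{(0, \dots, 0)} \\
&\quad + (-i)^j \frac{\partial^j}{\partial t_1 \cdots \partial t_j} \log \mathbb{E}\big[\exp(i t_{m+1} X_{m+1}) \cdots \exp(i t_j X_j)\big] \Big|_{(0, \dots, 0)} \\
&= 0,
\end{aligned}$$

because each of the two summands depends only on a proper subset of the variables $t_1, \dots, t_j$.
Thus in (2.7) we only have to consider those terms $\kappa(X_{i_1}, \dots, X_{i_j})$ for which the corresponding
$j$ vertices of $L$ (not necessarily distinct) form a connected subgraph.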

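For $1 \le q \le j$, let $\sum_{I_1, \dots, I_q}$ denote the summation over all partitions of $\{1, \dots, j\}$ into
$q$ nonempty subsets $I_m$, $1 \le m \le q$. The representation of a cumulant in [30, Eq. (1.57)],
which was derived by Leonov and Shiryaev in 1959 via Taylor's expansion, gives

$$\begin{aligned}
\kappa(X_{i_1}, \dots, X_{i_j}) &= \sum_{q=1}^{j} \sum_{I_1, \dots, I_q} (-1)^{q-1} (q-1)! \prod_{m=1}^{q} \mathbb{E}\Big[\prod_{r \in I_m} X_{i_r}\Big] \\
&\le \sum_{q=1}^{j} \sum_{I_1, \dots, I_q} \big|(-1)^{q-1} (q-1)!\big| \prod_{m=1}^{q} \prod_{r \in I_m} \|X_{i_r}\|_{m_r},
\end{aligned} \tag{2.8}$$

applying Hölder's inequality with $\sum_{r \in I_m} \frac{1}{m_r} = 1$ and symbolizing $(\mathbb{E}|X_{i_r}|^{m_r})^{1/m_r}$ by $\|X_{i_r}\|_{m_r}$.
Choosing $m_r = |I_m|$ and using the fact that $\|\cdot\|_{m_r} \le \|\cdot\|_j$ for $m_r \le j$ implies

$$\begin{aligned}
\kappa(X_{i_1}, \dots, X_{i_j}) &\le \sum_{q=1}^{j} \sum_{I_1, \dots, I_q} \big|(-1)^{q-1} (q-1)!\big| \prod_{m=1}^{q} \prod_{r \in I_m} \|X_{i_r}\|_j \\
&= \|X_{i_1}\|_j \cdots \|X_{i_j}\|_j \sum_{q=1}^{j} \sum_{I_1, \dots, I_q} \big|(-1)^{q-1} (q-1)!\big|.
\end{aligned} \tag{2.9}$$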

The number of partitions of a set containing $j$ elements into $q$ parts is the Stirling number
of the second kind

$$S(j, q) = \frac{1}{q!} \sum_{m=0}^{q} (-1)^{q-m} \binom{q}{m} m^j.$$
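The partition counts entering (2.9) can be checked numerically. The sketch below evaluates the Stirling numbers of the second kind with the inclusion-exclusion formula, cross-checks them against the usual recursion, and sums the weights $(q-1)!$ over all partitions of a $j$-element set. The function names are ours and purely illustrative.

```python
from functools import lru_cache
from math import comb, factorial


def stirling2_formula(j, q):
    """S(j, q) = (1/q!) * sum_{m=0}^{q} (-1)^(q-m) * C(q, m) * m^j, for j >= 1."""
    total = sum((-1) ** (q - m) * comb(q, m) * m ** j for m in range(q + 1))
    return total // factorial(q)  # the sum is always an exact multiple of q!


@lru_cache(maxsize=None)
def stirling2_recursive(j, q):
    """Same numbers via S(j, q) = q * S(j-1, q) + S(j-1, q-1)."""
    if j == q:
        return 1
    if q == 0 or q > j:
        return 0
    return q * stirling2_recursive(j - 1, q) + stirling2_recursive(j - 1, q - 1)


def partition_weight_sum(j):
    """Sum of (q-1)! over all partitions of {1,...,j} into q blocks, q = 1..j."""
    return sum(factorial(q - 1) * stirling2_formula(j, q) for q in range(1, j + 1))
```

For instance, a set of three elements has one partition into one block, three into two blocks and one into three blocks, so the weighted sum is $0! \cdot 1 + 1! \cdot 3 + 2! \cdot 1 = 6$.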
And inequality (2.9) implies

$$\kappa(X_{i_1}, \dots, X_{i_j}) \le \|X_{i_1}\|_j \cdots \|X_{i_j}\|_j \sum_{q=1}^{j} \frac{1}{q} \sum_{m=0}^{q} (-1)^{q-m} \binom{q}{m} m^j.$$



Since 



m=0 m=l 



9 fri " frt V L9/2J j"' ^ { km )'''-^\ [j/2\ 



m=0 m=l 

holds, we can apply ^ j^^.^gj ) = (b72j)'!(rj/2])! — ([j/2'j)!2 Stirling approximation m! > 

(f)"" to get 

«:(X,„...,X,J < ||Xi||,---||X,||,--j-'-+i- 



< ||Xi||,--.||X,||,-j!-(2ey < j\{2eAy. (2.10) 

Now we need to know the number of possible sets of $j$ vertices forming a connected
subgraph of $L$. If $v_1, \dots, v_j$ are $j$ such vertices, then we can rearrange the indices such that
each set $\{v_1, v_2\}, \{v_1, v_2, v_3\}, \dots, \{v_1, \dots, v_j\}$ forms itself a connected subgraph of $L$. There
are at most $j!$ tuples of $j$ vertices associated to the same ordering. There are $N$ ways of
choosing $v_1$. The vertex $v_2$ must equal $v_1$ or be connected to $v_1$; since $v_1$ has at most $M$
neighbours, this leaves at most $M + 1$ possibilities. Similarly, $v_3$ either equals $v_1$ or $v_2$ or is connected to one of
them. For this choice we have at most $2 + 2M = 2(M + 1)$ possibilities. Continuing this
way we see that there are at most

$$j!\, N (M + 1)\, 2(M + 1) \cdots (j - 1)(M + 1) = j!\,(j - 1)!\, N (M + 1)^{j-1}$$

choices of $j$ vertices forming a connected subgraph in $L$.

Inserting this estimate and the bound in (2.10) into equation (2.7) completes the proof
of Theorem 2.2. □
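The counting argument can be sanity-checked by brute force on a small graph. The following sketch enumerates all $j$-tuples of vertices (repetitions allowed) whose vertex set induces a connected subgraph and compares the count with $j!\,(j-1)!\,N(M+1)^{j-1}$. The path graph used as test case and all function names are assumptions chosen for illustration.

```python
from itertools import product
from math import factorial


def is_connected(vertices, adjacency):
    """True if the distinct vertices in `vertices` induce a connected subgraph."""
    vertices = set(vertices)
    start = next(iter(vertices))
    seen, stack = {start}, [start]
    while stack:  # depth-first search restricted to the chosen vertices
        v = stack.pop()
        for w in adjacency[v]:
            if w in vertices and w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == vertices


def count_connected_tuples(adjacency, j):
    """Number of j-tuples (v_1, ..., v_j), not necessarily distinct, that are connected."""
    return sum(1 for t in product(adjacency, repeat=j) if is_connected(t, adjacency))


def path_graph(n):
    """Adjacency sets of the path 0 - 1 - ... - (n-1); its maximal degree is 2 for n >= 3."""
    return {v: {w for w in (v - 1, v + 1) if 0 <= w < n} for v in range(n)}


def counting_bound(n_vertices, max_degree, j):
    """The bound j! (j-1)! N (M+1)^(j-1) from the proof of Theorem 2.2."""
    return factorial(j) * factorial(j - 1) * n_vertices * (max_degree + 1) ** (j - 1)
```

On the path with $N = 6$ vertices and $j = 3$ the enumeration finds 60 connected tuples, while the bound evaluates to 648, which illustrates that the estimate is generous but of the right combinatorial shape.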

2.1. Subgraphs in Erdős-Rényi random graphs. Consider an Erdős-Rényi random
graph with $n$ vertices, where for all $\binom{n}{2}$ different pairs of vertices the existence of an
edge is decided by an independent Bernoulli experiment with probability $p$. For each
$i \in \{1, \dots, \binom{n}{2}\}$, let $X_i$ be the random variable determining if the edge $e_i$ is present, i.e.
$P(X_i = 1) = 1 - P(X_i = 0) = p(n) =: p$. The model is called $G(n,p)$. The following statistic
counts the number of subgraphs isomorphic to a fixed graph $G$ with $k$ edges and $l$ vertices:

$$W = \sum_{1 \le \kappa_1 < \dots < \kappa_k \le \binom{n}{2}} 1_{\{(e_{\kappa_1}, \dots, e_{\kappa_k}) \sim G\}} \Big( \prod_{i=1}^{k} X_{\kappa_i} \Big). \tag{2.11}$$

Here $(e_{\kappa_1}, \dots, e_{\kappa_k})$ denotes the graph with edges $e_{\kappa_1}, \dots, e_{\kappa_k}$ present and $A \sim G$ denotes the
fact that the subgraph $A$ of the complete graph is isomorphic to $G$. Here and in the following
we speak about connected subgraphs only. Let the constant $a := \operatorname{aut}(G)$ denote the order of
the automorphism group of $G$. The number of copies of $G$ in $K_n$, the complete graph with
$n$ vertices and $\binom{n}{2}$ edges, is given by $\binom{n}{l} \frac{l!}{a}$ and the expectation of $W$ is equal to

$$\mathbb{E}[W] = \binom{n}{l} \frac{l!}{a}\, p^k = O(n^l p^k).$$

It is easy to see that $P(W > 0) = o(1)$ if $p \ll n^{-l/k}$. Moreover, for the
graph property that $G$ is a subgraph, the probability that a random graph possesses it jumps
from 0 to 1 at the threshold probability $n^{-1/m(G)}$, where $m(G) = \max\big\{ \frac{e_H}{v_H} : H \subseteq G, v_H > 0 \big\}$,
and $e_H$, $v_H$ denote the number of edges and vertices of $H \subseteq G$, respectively, see [21]. Ruciński
proved in [29] that $\frac{W - \mathbb{E}(W)}{\sqrt{\mathbb{V}(W)}}$ converges in distribution to a standard normal distribution if
and only if

$$n\, p^{m(G)} \xrightarrow{n \to \infty} \infty \quad \text{and} \quad n^2 (1 - p) \xrightarrow{n \to \infty} \infty. \tag{2.12}$$

An upper bound for lower tails was proven by Janson [20], applying the FKG-inequality.
A comparison of seven different techniques proving bounds for the infamous upper tail can
be found in [22], see also [7] for a recent improvement. The large deviation principle for
subgraph count statistics in Erdős-Rényi random graphs with fixed $p$ is solved in [8].

As a special case of Example 2.1 let $\{H_\alpha\}_{\alpha \in \mathcal{I}}$ be given subgraphs of the complete graph
$K_n$ and let $I_\alpha$ be the indicator that $H_\alpha$ appears as a subgraph in $G(n,p)$, that is, $I_\alpha =
1_{\{H_\alpha \subset G(n,p)\}}$, $\alpha \in \mathcal{I}$. Then $L(S)$ with $S = \{e_{H_\alpha} : \alpha \in \mathcal{I}\}$ is a dependency graph with edge set
$\{\alpha\beta : e_{H_\alpha} \cap e_{H_\beta} \ne \emptyset\}$. Here we take the family of subgraphs of $K_n$ that are isomorphic to
a fixed graph $G$, denoted by $\{G_\alpha\}_{\alpha \in A_n}$. Consider $X_\alpha = I_\alpha - \mathbb{E} I_\alpha$ and define the graph $L_n$
by connecting every pair of indices $\alpha$ and $\beta$ such that the corresponding graphs $G_\alpha$ and $G_\beta$
have a common edge. This is evidently a dependency graph for $(X_\alpha)_{\alpha \in A_n}$, see [21, Example
6.19]. Note that the subgraph count statistic $W - \mathbb{E}W$ given in (2.11) is equal to the sum
of all $X_\alpha$, $\alpha \in A_n$.
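To make the construction of $L_n$ concrete, here is a minimal sketch for the special case where $G$ is a triangle ($k = 3$ edges, $l = 3$ vertices). It lists all triangle copies in $K_n$ by their edge sets, joins two copies whenever they share an edge, and reports the maximal degree of the resulting dependency graph. Restricting to triangles and the function names are assumptions made for this illustration.

```python
from itertools import combinations


def triangle_copies(n):
    """Edge sets of all triangles in the complete graph K_n on vertices 0..n-1."""
    return [frozenset(combinations(tri, 2)) for tri in combinations(range(n), 3)]


def dependency_graph(edge_sets):
    """Join indices a and b whenever the corresponding copies share at least one edge."""
    adjacency = {a: set() for a in range(len(edge_sets))}
    for a, b in combinations(range(len(edge_sets)), 2):
        if edge_sets[a] & edge_sets[b]:
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency


def max_degree(adjacency):
    """Maximal number of neighbours of a vertex in the dependency graph."""
    return max(len(neighbours) for neighbours in adjacency.values())
```

For $n = 5$ there are 10 triangles and each shares an edge with 6 others, which is below the value $k(n-2)_{l-2} - 1 = 8$ of the bound on $M_n$ used in the proof of Theorem 2.3, as we read that bound.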

We will be able to prove the following moderate deviation principle for the subgraph count 
statistic: 

Theorem 2.3. Let $G$ be a fixed graph with $k$ edges and $l$ vertices. Let $(a_n)_n$ be a sequence
with

1/5 



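where $e$ is Euler's number. Then the scaled subgraph count statistics $\Big( \frac{W - \mathbb{E}W}{a_n \sqrt{\mathbb{V}W}} \Big)_n$ satisfy the
moderate deviation principle with speed $a_n^2$ and rate function $x^2/2$ if

$$n^2 p^{3(2k-1)} (1 - p)^3 \xrightarrow{n \to \infty} \infty \tag{2.13}$$

holds.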

Remark 2.4. Condition (2.13) on $p(n)$ assures that $(a_n)_n$ grows to infinity. Moderate
deviations for the subgraph count statistic of Erdős-Rényi random graphs are already
considered in [11], studying the log-Laplace transform via martingale differences and using
the Gärtner-Ellis Theorem. The moderate deviation principle stated in Theorem 2.3 is
on the one hand valid for more probabilities $p(n)$ than in [11, Theorem 1.1]. But on the
other hand the scaling $\beta_n := a_n \sqrt{\mathbb{V}W}$ has a smaller range in comparison to [11]: Using
$\text{const.}\, n^{2l-2} p^{2k-1} (1 - p) \le \mathbb{V}W \le \text{const.}\, n^{2l-2} p^{2k-1} (1 - p)$ (see [29, 2nd section, page 5]) the
scaling in Theorem 2.3 is equal to

In contrast, the scaling in [11, Theorem 1.1] is bounded by:

$$n^{l-1} p^{k-1} \sqrt{p(1 - p)} \ll \beta_n \ll n^{l} \big( p^{k-1} \sqrt{p(1 - p)} \big)^4.$$



Proof of Theorem 2.3. In order to prove Theorem 2.3 we apply Theorem 2.2 to show that
the conditions of Theorem 1.1 are satisfied. Let us consider the subgraph count statistic
in an Erdős-Rényi random graph for any fixed subgraph $G$ with $l$ vertices and $k$ edges and
its associated dependency graph $L_n$ defined as above. Let $M_n$ be the maximal degree of
the dependency graph $L_n$. Thus to determine $M_n$ we need to bound the maximal number
of subgraphs isomorphic to $G$ having at least one edge in common with a fixed subgraph
$G'$ which is itself isomorphic to $G$. For every subgraph $G'$, isomorphic to $G$, we have to
consider one of the $k$ edges of $G'$ to be the common edge. Accordingly we can choose
$l - 2$ further vertices out of $n - 2$ possible vertices, which justifies a factor $(n - 2)_{l-2} :=
(n - 2)(n - 3) \cdots (n - l + 1)$. We can subtract one, because we do not count $G'$
itself, and obtain

$$M_n \le k (n - 2)_{l-2} - 1 \le k n^{l-2} - 1.$$

The number $N_n$ of the subgraphs in $K_n$ which are isomorphic to $G$ satisfies the inequality

$$\frac{n(n-1) \cdots (n-l+1)}{l!} = \binom{n}{l} \le N_n \le (n)_l = n(n-1) \cdots (n-l+1).$$

As stated in Remark 2.4, the variance $\sigma^2 = \mathbb{V}W$ of $Z = W - \mathbb{E}W$ is bounded from below by a constant
times $n^{2l-2} p^{2k-1} (1 - p)$. For the cumulants of $\frac{Z}{\sigma}$ it follows with (2.6) that, for $j \ge 3$,



Ir^l < (j!)V(A;n'-2)^'-i(2e) 



(j!)3^F-i(2eV 



V. 



1 



(const. n'~^p'^^^A/j5(l — p))" 



{const.p''-^y/p{l-p)Y 

< (fif ( , ^ ff= =^]' . (2.14) 

yn^const.p" vp(l— pjj / 

In the last inequality we used the fact that $3(j - 2) \ge j$ is equivalent to $j \ge 3$. This implies
that condition (1.3) is satisfied for $\gamma = 2$ and

n (const .p'^"^ \^p{^ — p)) ^ 
$\Delta_n$ increases if $n^2 p^{3(2k-1)} (1 - p)^3 \to \infty$. In the following we consider only this case.

Now we choose a sequence $(a_n)_n$ such that

$$1 \ll a_n \ll \Delta_n^{1/(1+2\gamma)}$$

and apply Theorem 1.1, which ends the proof of Theorem 2.3. □

Remark 2.5. As mentioned in the introduction, the cumulant bounds (1.3) imply a Central
Limit Theorem if $\lim_{n \to \infty} \Delta_n = \infty$. Moreover applying [30, Lemma 2.1] and inequality (2.14)
proves the following bound for the Kolmogorov distance:

$$\sup_{x \in \mathbb{R}} \left| P\Big( \frac{W - \mathbb{E}W}{\sqrt{\mathbb{V}W}} \le x \Big) - \Phi(x) \right| \le 108 \Big( \frac{\sqrt{2}}{6} \Delta_n \Big)^{-1/5} \le \frac{\text{const.}}{n^{1/5} \big( p^{k-1} \sqrt{p(1 - p)} \big)^{3/5}}.$$

This bound is weaker than the inequality obtained in [6] via Stein's method. For some
improvements see [16].



2.2. Another example of a dependency graph. Let $X_i$, $i \ge 1$, be independent centered
random variables with existing variances $\mathbb{V}X_i \ge \varepsilon$ for some fixed $\varepsilon > 0$ and define $Z_n :=
\sum_{i=1}^{n} X_i X_{i+1}$. Let $A$ be a constant such that $|X_i| \le \sqrt{A}$ almost surely. Let $(a_n)_n$ be a
divergent sequence where $a_n \ll n^{1/10}$. Then $\Big( \frac{1}{a_n \sqrt{\mathbb{V}Z_n}} Z_n \Big)_n$ satisfies the moderate deviation
principle with speed $a_n^2$ and rate function $I(x) = x^2/2$.



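Proof. Set $Y_i := X_i X_{i+1}$ for all $i = 1, \dots, n$. $Y_i$ is independent of $Y_j$ for all $j$ not equal to
$i - 1$, $i$ and $i + 1$. Let $L_n$ be the graph with vertex set $\{1, \dots, n\}$ and edges between 1 and 2, 2
and 3, ... as well as between $n - 1$ and $n$. $L_n$ is a dependency graph of $\{Y_i\}_{i=1}^{n}$ with $N = n$
and $M = 2$. The variance $\sigma_n^2 = \mathbb{V}Z_n$ is bigger than or equal to a constant times $n$:

$$\begin{aligned}
\mathbb{V}Z_n &= \sum_{i,j=1}^{n} \mathbb{E}[Y_i Y_j] = \sum_{i,j=1}^{n} \mathbb{E}[X_i X_{i+1} X_j X_{j+1}] \\
&= \sum_{i=1}^{n} \mathbb{E}[X_i^2 X_{i+1}^2] + \sum_{i=2}^{n} \mathbb{E}[X_i X_{i+1} X_{i-1} X_i] + \sum_{i=1}^{n-1} \mathbb{E}[X_i X_{i+1} X_{i+1} X_{i+2}] \\
&= \sum_{i=1}^{n} \mathbb{V}(X_i) \mathbb{V}(X_{i+1}) + 2 \sum_{i=2}^{n} \mathbb{E}[X_{i-1}]\, \mathbb{E}[X_i^2]\, \mathbb{E}[X_{i+1}] \\
&= \sum_{i=1}^{n} \mathbb{V}(X_i) \mathbb{V}(X_{i+1}) \ge n \min_{j=1, \dots, n+1} \mathbb{V}(X_j)^2
\end{aligned}$$

due to the independence of $X_1, \dots, X_{n+1}$ and the fact that their expectations are equal to
zero. In particular, for independent and identically distributed random variables $X_i$, we have
$\mathbb{V}Z_n = \text{const.} \cdot n$. Using Theorem 2.2 we have a bound for the cumulant of $\frac{1}{\sqrt{\mathbb{V}Z_n}} \sum_{i=1}^{n} Y_i$: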
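$$|\Gamma_j| \le (j!)^3\, \frac{n\, 3^{j-1} (2eA)^j}{(\mathbb{V}Z_n)^{j/2}} \le (j!)^3 \Big( \frac{\text{const.}}{\sqrt{n}} \Big)^{j-2} \quad \text{for all } j \ge 3.$$

Now we can apply Theorem 1.1 with $\gamma = 2$, $\Delta_n = \text{const.} \cdot \sqrt{n}$ and a sequence $(a_n)_n$ satisfying
$\frac{a_n}{n^{1/10}} \xrightarrow{n \to \infty} 0$. This proves the claim. □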
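The variance computation in this proof can be verified exactly in the simplest i.i.d. case. The sketch below assumes Rademacher signs for the $X_i$ (an assumption made only for this example, giving $\mathbb{V}X_i = 1$ and $A = 1$) and enumerates all $2^{n+1}$ sign patterns to compute $\mathbb{V}Z_n$ for $Z_n = \sum_{i=1}^n X_i X_{i+1}$ without any simulation error.

```python
from fractions import Fraction
from itertools import product


def exact_variance_of_product_sum(n):
    """Exact variance of Z_n = sum_{i=1}^{n} X_i X_{i+1} for i.i.d. Rademacher X_i."""
    # Every sign vector (x_1, ..., x_{n+1}) has the same probability 2^-(n+1).
    values = [
        sum(x[i] * x[i + 1] for i in range(n))
        for x in product((-1, 1), repeat=n + 1)
    ]
    mean = Fraction(sum(values), len(values))
    second_moment = Fraction(sum(v * v for v in values), len(values))
    return second_moment - mean ** 2
```

In this case the formula $\mathbb{V}Z_n = \sum_{i=1}^n \mathbb{V}(X_i)\mathbb{V}(X_{i+1})$ gives exactly $n$, which is what the enumeration returns; the cost grows like $2^{n+1}$, so the check is meant for small $n$ only.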

3. Application to non-degenerate $U$-statistics

Let $X_1, \dots, X_n$ be independent and identically distributed random variables with values
in a measurable space $\mathcal{X}$. For a measurable and symmetric function $h : \mathcal{X}^m \to \mathbb{R}$ we define

$$U_n(h) := \frac{1}{\binom{n}{m}} \sum_{1 \le i_1 < \dots < i_m \le n} h(X_{i_1}, \dots, X_{i_m}),$$

where symmetric means invariant under all permutations of its arguments. $U_n(h)$ is called a
$U$-statistic with kernel $h$ and degree $m$. Define the conditional expectation for $c = 1, \dots, m$
by

$$h_c(x_1, \dots, x_c) := \mathbb{E}[h(x_1, \dots, x_c, X_{c+1}, \dots, X_m)] = \mathbb{E}[h(X_1, \dots, X_m) \mid X_1 = x_1, \dots, X_c = x_c]$$

and the variances by $\sigma_c^2 := \mathbb{V}[h_c(X_1, \dots, X_c)]$. A $U$-statistic is called degenerate of order $d$
if and only if $0 = \sigma_1^2 = \dots = \sigma_d^2 < \sigma_{d+1}^2$ and non-degenerate if $\sigma_1^2 > 0$. As is well known, the
weak limits of appropriately scaled $U$-statistics depend on the order of degeneracy. By the
Hoeffding decomposition (see for example [25]), we know that for every symmetric function $h$,
the $U$-statistic can be decomposed into a sum of degenerate $U$-statistics of different orders.
In the degenerate case the linear term of this decomposition disappears. On the level of
moderate deviations, in [15] the MDP for non-degenerate $U$-statistics is investigated; the
proof used the fact that the linear term in the Hoeffding decomposition is leading in the
non-degenerate case. Moreover in [15], moderate deviation principles for Banach-space valued
degenerate $U$-statistics were established, with non-convex rate functions.

In the present paper the observed U-statistics are assumed to be non-degenerate. The 
main result is: 

Theorem 3.1. (Moderate deviations for non-degenerate $U$-statistics)

Let $X_1, X_2, \dots$ be a sequence of independent and identically distributed random variables and

$$U_n(h) = \frac{1}{\binom{n}{2}} \sum_{1 \le i_1 < i_2 \le n} h(X_{i_1}, X_{i_2})$$

a non-degenerate $U$-statistic of degree two. Let $\sigma_1^2 := \mathbb{V}\big( \mathbb{E}[h(X_1, X_2) \mid X_1] \big) < \infty$ and suppose
that there exist constants $\gamma \ge 1$ and $C > 0$ such that

$$\mathbb{E}\big[ |h(X_1, X_2)|^j \big] \le C^j (j!)^\gamma \tag{3.15}$$

for all $j \ge 3$. Defining

{ - ,ifC<(Ti 

and An := ('^^/^§(^\ {o-nju be a sequence growing to infinity such that 



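$$\frac{a_n}{\Delta_n^{1/(1+2\gamma)}} \xrightarrow{n \to \infty} 0. \tag{3.16}$$

Then $\Big( \frac{U_n}{a_n \sqrt{\mathbb{V}(U_n)}} \Big)_{n \in \mathbb{N}}$ satisfies the moderate deviation principle with speed $a_n^2$ and rate
function $I(x) = \frac{x^2}{2}$.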



Remark 3.2. Let us discuss the conditions (3.15) and (3.16) in detail.

a. In [15] a moderate deviation principle for degenerate and for non-degenerate $U$-statistics
with a kernel function $h$, which is bounded or satisfies exponential moment conditions, was
considered (see also [12]). In [15] the exponential moment conditions for a non-degenerate
$U$-statistic of degree two read as follows: the function $h_1$ of the leading term in the Hoeffding
decomposition has to satisfy the weak Cramér condition $\int \exp(\alpha \|h_1\|)\, dP < \infty$ for some $\alpha > 0$.
Moreover $h_2$ has to satisfy the condition that there exists at least one $\alpha_h > 0$ such that
$\int \exp(\alpha_h \|h_2\|^2)\, dP^{\otimes 2} < \infty$. The MDP in [15] was proved for $1 \ll a_n \ll \sqrt{n}$. Since the leading
term of the Hoeffding decomposition is a partial sum of i.i.d. random variables, the weak
Cramér condition on $h_1$ can be relaxed. A necessary and sufficient condition is given in [13],
which is

$$\limsup_{n \to \infty} \frac{1}{a_n^2} \log\big( n\, P(|h_1(X_1)| > \sqrt{n}\, a_n) \big) = -\infty.$$

The strong condition on $h_2$ is due to the fact that a Bernstein-type inequality for the
degenerate part of the Hoeffding decomposition was applied, see [15, Theorem 3.26].
Unfortunately it is not obvious how to compare condition (3.15) with the conditions in [15].
Condition (3.15) is a Bernstein-type condition on the moments of $h$, which is equivalent to
a weak Cramér condition on $h$. We make no assumptions on $h_2$, hence (3.15) seems to
be weaker. On the other hand, even in the case of the best bounds ($\gamma = 1$) in (3.15), our
result is restricted to $1 \ll a_n \ll n^{1/6}$. The price of less restrictive conditions on $h$ seems to
be that the moderate deviation principle holds in a smaller scaling interval. Our Theorem
is an improvement of [15] for some $a_n$.

b. We can also compare the result in Theorem 3.1 with the result in [11, Theorem 3.1], which
was deduced via the Laplace transform. Let the kernel function $h$ be bounded. Obviously
condition (3.15) is fulfilled with $\gamma = 1$ and according to Theorem 3.1 the object

$$\frac{U_n}{a_n \sqrt{\mathbb{V}U_n}}$$

satisfies the MDP with speed $a_n^2$ and rate function $I(x) = \frac{x^2}{2}$ for every sequence $(a_n)_n$ growing
to infinity slowly enough such that $1 \ll a_n \ll n^{1/6}$.

Let $(b_n)_n$ be a sequence satisfying $\sqrt{n} \ll b_n \ll n$. From [11, Theorem 3.1] it follows
that $\big( \frac{n}{b_n} U_n \big)_n$ satisfies the MDP with speed $\frac{b_n^2}{n}$ and rate function $I(x) = \frac{x^2}{8\sigma_1^2}$. Choosing
$b_n = n a_n \sqrt{\mathbb{V}U_n}$ in [11, Theorem 3.1] requires the scaling $\sqrt{n} \ll b_n = n a_n \sqrt{\mathbb{V}U_n} \ll n$.
Applying that $n \mathbb{V}U_n = 4\sigma_1^2 + O(\frac{1}{n})$ gives $1 \ll a_n \ll \sqrt{n}$. From [11, Theorem 3.1] we obtain
that $\frac{n}{b_n} U_n = \frac{U_n}{a_n \sqrt{\mathbb{V}U_n}}$ satisfies the MDP with speed $n a_n^2 \mathbb{V}U_n = a_n^2 \big( 4\sigma_1^2 + O(\frac{1}{n}) \big)$ and rate
function $I(x) = \frac{x^2}{8\sigma_1^2}$. This is the same result as stated above via Theorem 3.1. Therefore
the MDP via the log-Laplace transform holds for a larger scaling range. But [11, Theorem
3.1] assumed bounded $U$-statistics, and thus Theorem 3.1 is valid for more general kernel
functions $h$ for some $a_n$.

Proof. According to [2], see [30, Lemma 5.3], the cumulant of $U_n$ can be bounded by

|r,(f/„)|<2e2(^-2)^C^-(j!)^+^- ^ 



for all $j = 1, 2, \dots, n - 1$ and $n \ge 7$. The quite involved proof is presented in [30]. The
variance for the non-degenerate $U$-statistic is given by $\mathbb{V}(U_n) = \frac{4(n-2)}{n(n-1)} \sigma_1^2 + \frac{2}{n(n-1)} \sigma_2^2$, see Theorem
3 in [25, Chapter 1.3]. Therefore there exists an $n_0 \ge 7$ big enough such that $\sqrt{\mathbb{V}(U_n)} \ge$
The following bound holds for the cumulants of 



i-2 

|r,| < 



n 



for all $j = 3, \dots, n - 1$ and $n \ge n_0$. Applying Theorem 1.1, condition (3.16) with $\Delta_n$ as defined in Theorem 3.1
is a sufficient condition for the moderate deviation principle. □

Remark 3.3. Let us remark that known precise estimates on cumulants will enable us to
prove moderate deviation principles for further probabilistic objects. Examples are polynomial
forms, Pitman polynomial estimators and multiple stochastic integrals (see [30]). This
is not the topic of this paper.



4. Moderate deviations for the characteristic polynomials in the circular ensembles

In the last decade, a huge number of results in random matrix theory were proved. Some
of the results were extrapolated to make interesting conjectures on the behaviour of the
Riemann zeta function on the critical line. It is known that random matrix statistics describe
the local statistics of the imaginary parts of the zeros high up on the critical line. The
random matrix statistic considered for this conjectural understanding of the zeta function is
the characteristic polynomial $Z(\theta) := Z(U, \theta) = \det(I - U e^{-i\theta})$ of a unitary $n \times n$ matrix $U$.
The matrix $U$ is considered as a random variable in the circular unitary ensemble (CUE), that
is, the unitary group $U(n)$ equipped with the unique translation-invariant (Haar) probability
measure. In [23] exact expressions for any matrix size $n$ are derived for the moments of $|Z|$
and from these the asymptotics of the value distribution and cumulants of the real and
imaginary parts of $\log Z$ as $n \to \infty$ are obtained. In the limit, these distributions are
independent and Gaussian. In [23] the results were generalized to the circular orthogonal
(COE) and the circular symplectic (CSE) ensembles. The goal of this section is to prove
a moderate deviation principle for the appropriately rescaled $\log Z$ for the three classical
circular ensembles applying Theorem 1.1. Remark that our result is known for CUE, see
[18, Theorem 3.5] and Remark 4.2. We present a different proof and generalize the result to
the COE and CSE ensembles. We start with the representation of $Z(U, \theta)$ in terms of the
eigenvalues $e^{i\theta_k}$ of $U$:

$$Z(U, \theta) = \det(I - U e^{-i\theta}) = \prod_{k=1}^{n} \big( 1 - e^{i(\theta_k - \theta)} \big).$$

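Let $Z$ now represent the characteristic polynomial of an $n \times n$ matrix $U$ in either the CUE
($\beta = 2$), the COE ($\beta = 1$), or the CSE ($\beta = 4$). The C$\beta$E average can then be performed
using the joint probability density for the eigenphases $\theta_k$

$$\frac{((\beta/2)!)^n}{(n\beta/2)!\,(2\pi)^n} \prod_{1 \le j < m \le n} \big| e^{i\theta_j} - e^{i\theta_m} \big|^\beta.$$

Hence the $s$-moment of $|Z|$ is of the form

$$\langle |Z|^s \rangle_\beta = \frac{((\beta/2)!)^n}{(n\beta/2)!\,(2\pi)^n} \int_0^{2\pi} \cdots \int_0^{2\pi} d\theta_1 \cdots d\theta_n \prod_{1 \le j < m \le n} \big| e^{i\theta_j} - e^{i\theta_m} \big|^\beta \, \Big| \prod_{k=1}^{n} \big( 1 - e^{i(\theta_k - \theta)} \big) \Big|^s.$$

This integral can be evaluated using Selberg's formula, see [27], which leads to

$$\langle |Z|^s \rangle_\beta = \prod_{j=0}^{n-1} \frac{\Gamma(1 + j\beta/2)\, \Gamma(1 + s + j\beta/2)}{\big( \Gamma(1 + s/2 + j\beta/2) \big)^2},$$

denoting the gamma function by $\Gamma$ (without an index). Hence $\log \langle |Z|^s \rangle_\beta$ has an easy form
and equals at the same time by definition $\sum_{j \ge 1} \frac{\Gamma_j(\beta)}{j!} s^j$, where $\Gamma_j(\beta) = \Gamma_j(\Re \log Z)$ denotes
the $j$-th cumulant of the distribution of the real part of $\log Z$ under C$\beta$E. Differentiating
$\log \langle |Z|^s \rangle_\beta$ one obtains

$$\Gamma_j(\beta) = \frac{2^{j-1} - 1}{2^{j-1}} \sum_{k=0}^{n-1} \psi^{(j-1)}(1 + k\beta/2), \tag{4.17}$$

where

$$\psi^{(j)}(z) = (-1)^{j+1} \int_0^\infty \frac{t^j e^{-zt}}{1 - e^{-t}}\, dt$$

for $z \in \mathbb{C}$ with $\Re z > 0$ are the polygamma functions, see [1, 6.4.1]. The result of this section
is: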



Theorem 4.1. Let $(a_n)_{n \in \mathbb{N}}$ be a sequence in $\mathbb{R}$ such that $1 \ll a_n \ll \sqrt{\log n}$ holds. The
sequence of random variables ( ^ ^ ) and ( ^ ) under the averaqe over the 

C$\beta$E of $n \times n$ matrices satisfy a moderate deviation principle for $\beta = 1, 2$ and $4$ with speed
$a_n^2$ and rate function $I(x) = \frac{x^2}{2}$.
Remark 4.2. Theorem 4.1 for $\beta = 2$ states the same moderate deviation principle as in
[18, page 440, Theorem 3.5] for the same scaling range $1 \ll a_n \ll \sqrt{\log n}$ - but the speed
in Theorem 4.1 here is given more explicitly: The speed $b_n$ of moderate deviations in [18,



Theorem 3.5] is given by = — v-, where denotes the Lambert's VF-Function. 

W-i ( - J 

The Lambert W-function solves the equation $W(x)e^{W(x)} = x$ and $W_{-1}$ denotes the real
branch with $W_{-1}(x) \le -1$. For negative $x$ tending to zero we get the following asymptotic
behaviour: $W_{-1}(x) = \log|x| + O(\log|\log|x||)$. This implies that the limiting speed behaves
like

log ri — log a„cr„, logn 

Additionally in [18, Theorem 3.5] the asymptotic behaviour of the rescaled $\Re\log Z$ for scaling ranges
$a_n = \sqrt{\log n}$ and $\sqrt{\log n} \ll a_n \ll n/\sqrt{\log n}$ is considered. The circular orthogonal and
circular symplectic ensembles were not studied in [18].
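
The asymptotic behaviour of $W_{-1}$ quoted in Remark 4.2 can be checked numerically. The following is a minimal sketch, not part of the original argument: the function name `lambert_w_minus1` and the Newton iteration on the logarithmic form $w + \log(-w) = \log(-x)$ are illustrative choices, and only the Python standard library is assumed.

```python
import math


def lambert_w_minus1(x, tol=1e-13, max_iter=200):
    """Real branch W_{-1}(x) of the Lambert W-function for -1/e < x < 0.

    Illustrative helper (hypothetical name). It solves w + log(-w) = log(-x),
    which is equivalent to w * exp(w) = x on the branch w < -1.
    """
    if not (-1.0 / math.e < x < 0.0):
        raise ValueError("W_{-1} is real only for -1/e < x < 0")
    log_abs_x = math.log(-x)                 # log|x|, strictly below -1
    w = log_abs_x - math.log(-log_abs_x)     # starting point, strictly below -1
    for _ in range(max_iter):
        g = w + math.log(-w) - log_abs_x     # residual of the logarithmic equation
        step = g * w / (w + 1.0)             # Newton step g / g'(w), g'(w) = (w + 1) / w
        w -= step
        if abs(step) <= tol * abs(w):
            break
    return w


if __name__ == "__main__":
    for x in (-1e-3, -1e-10, -1e-50):
        w = lambert_w_minus1(x)
        gap = abs(w - math.log(abs(x)))
        print(x, w, gap / math.log(abs(math.log(abs(x)))))
```

The printed ratio compares $|W_{-1}(x) - \log|x||$ with $\log|\log|x||$; it staying bounded as $x \to 0^-$ is what the $O(\log|\log|x||)$ term expresses.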



Proof of Theorem 4.1. In [23, eq. (47)] an integral representation of the cumulants of $\Re\log(Z)$
for the case $\beta = 2$ is derived and an outline of the extension to $\beta = 1$ and $4$ is given. Similarly
we prove a bound of the cumulants satisfying the condition (1.3) for these three circular
ensembles. With (4.17) the cumulant can be written as

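$$\begin{aligned}
\Gamma_j(\Re\log(Z)) &= \frac{2^{j-1}-1}{2^{j-1}}\sum_{k=0}^{n-1}\psi^{(j-1)}\Bigl(1+\frac{k\beta}{2}\Bigr)\\
&= \frac{2^{j-1}-1}{2^{j-1}}\sum_{k=0}^{n-1}(-1)^{j}\int_0^\infty \frac{t^{j-1}e^{-(1+k\beta/2)t}}{1-e^{-t}}\,dt\\
&= \frac{2^{j-1}-1}{2^{j-1}}(-1)^{j}\int_0^\infty t^{j-1}\,\frac{e^{-t}}{1-e^{-t}}\,\frac{1-e^{-n\beta t/2}}{1-e^{-\beta t/2}}\,dt\\
&= \frac{2^{j-1}-1}{2^{j-1}}(-1)^{j}\int_0^\infty t^{j-1}e^{-t}\sum_{r=0}^{\infty}\sum_{s=0}^{\infty}e^{-st}e^{-r\beta t/2}\bigl(1-e^{-n\beta t/2}\bigr)\,dt
\end{aligned}$$
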

using properties of geometric series for the last two equalities. Thus we have 

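$$\Gamma_j(\Re\log(Z)) = \frac{2^{j-1}-1}{2^{j-1}}(-1)^{j}\sum_{r=0}^{\infty}\sum_{s=1}^{\infty}\int_0^\infty t^{j-1}e^{-(s+r\beta/2)t}\bigl(1-e^{-n\beta t/2}\bigr)\,dt.$$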

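To get a representation via the gamma function we integrate by substitution needing a
prefactor $(s + r\frac{\beta}{2})^{-(j-1)}$ for $t^{j-1}$ and the derivative $(s + r\frac{\beta}{2})$ of $(s + r\frac{\beta}{2})t$:

$$\Gamma_j(\Re\log(Z)) = \frac{2^{j-1}-1}{2^{j-1}}(-1)^{j}\,\Gamma(j)\Biggl(\sum_{r=0}^{\infty}\sum_{s=1}^{\infty}\frac{1}{(s+r\frac{\beta}{2})^{j}} - \sum_{r=n}^{\infty}\sum_{s=1}^{\infty}\frac{1}{(s+r\frac{\beta}{2})^{j}}\Biggr),$$

and hence

$$\bigl|\Gamma_j(\Re\log(Z))\bigr| \le \frac{2^{j-1}-1}{2^{j-1}}\,\bigl|(-1)^{j}\Gamma(j)\bigr|\sum_{r=0}^{\infty}\sum_{s=1}^{\infty}\frac{1}{(s+r\frac{\beta}{2})^{j}}. \tag{4.18}$$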
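
As an illustration of the cumulant formula (4.17) and of the type of bound derived above, the following sketch evaluates $\Gamma_j(\beta) = \frac{2^{j-1}-1}{2^{j-1}}\sum_{k=0}^{n-1}\psi^{(j-1)}(1+k\beta/2)$ numerically. It is not taken from the paper: the function names are hypothetical, the polygamma function is computed from its standard series $\psi^{(m)}(z) = (-1)^{m+1}m!\sum_{i\ge 0}(z+i)^{-(m+1)}$ with a simple tail correction, and the bound tested is the case $\beta = 2$, where the double sum equals $\zeta(j-1) \le \pi^2/6$.

```python
import math


def polygamma(m, z, terms=2000):
    """psi^{(m)}(z) for integer m >= 1 and real z > 0 (series plus integral tail)."""
    head = sum((z + i) ** -(m + 1) for i in range(terms))
    a = z + terms
    tail = a ** -m / m + 0.5 * a ** -(m + 1)   # Euler-Maclaurin estimate of the remainder
    return (-1) ** (m + 1) * math.factorial(m) * (head + tail)


def cumulant_re_log_z(j, n, beta):
    """j-th cumulant (j >= 2) of Re log Z for n x n matrices, following (4.17)."""
    prefactor = (2 ** (j - 1) - 1) / 2 ** (j - 1)
    return prefactor * sum(polygamma(j - 1, 1 + k * beta / 2) for k in range(n))


def cumulant_bound_beta2(j):
    """Upper bound for |Gamma_j| when beta = 2 and j >= 3, using zeta(j-1) <= pi^2/6."""
    return (2 ** (j - 1) - 1) / 2 ** (j - 1) * math.gamma(j) * math.pi ** 2 / 6


if __name__ == "__main__":
    n = 50
    print("variance:", cumulant_re_log_z(2, n, 2))
    for j in range(3, 7):
        print(j, cumulant_re_log_z(j, n, 2), cumulant_bound_beta2(j))
```

For $\beta = 2$ the second cumulant grows like $\frac12\log n$, while the higher cumulants stay bounded in $n$; this is the mechanism that makes the normalized cumulants decay in $n$.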
In the case $\beta = 1$ we can estimate the sum as follows: For $r \in \mathbb{N}_0$ and $s \in \mathbb{N}$ the integer
$k = 2(s + \frac{r}{2}) = 2s + r$ can be written in $k/2$ number of ways if $k$ is even, in no way if $k = 1$,
and in $(k-1)/2$ ways otherwise.



oo oo 



r=0 s=l ^ 2^ 



k 



_2k_ ^ 2k -1 ^ 



^^{2ky i^^{2k-iy ^(2fc-i)^ 

= 2^-i(c(j-2) + (l-^)C(j-l)), 
applying the fact that EZii^^ - l)"^' = EZi^-' - EZii^k)-^ = (l " ^)C(j - !)• 

2 

Bounding the zeta function by this gives 

oo oo 2 2 

s + ^ V - 6 3 

For $\beta = 2$ we immediately get:

$$\sum_{r=0}^{\infty}\sum_{s=1}^{\infty}\frac{1}{(s+r)^{j}} = \sum_{k=1}^{\infty}\frac{k}{k^{j}} = \zeta(j-1) \le \frac{\pi^2}{6}.$$
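
The counting arguments for $\beta = 1$ and $\beta = 2$ can be verified by brute force. The sketch below is illustrative only (the helper names `lattice_sum` and `counted_sum` are hypothetical): it sums $(as + br)^{-j}$ over all $s \ge 1$, $r \ge 0$ with $as + br \le k_{\max}$ and compares the result with $\sum_k (\text{number of representations of } k)\,k^{-j}$, using $\lfloor k/2 \rfloor$ representations for $k = 2s + r$ and $k$ representations for $k = s + r$.

```python
import math


def lattice_sum(j, a, b, k_max):
    """Sum of (a*s + b*r)^(-j) over s >= 1, r >= 0 with a*s + b*r <= k_max."""
    total = 0.0
    s = 1
    while a * s <= k_max:
        r = 0
        while a * s + b * r <= k_max:
            total += float(a * s + b * r) ** -j
            r += 1
        s += 1
    return total


def counted_sum(j, ways, k_max):
    """Sum over k <= k_max of ways(k) * k^(-j), ways(k) = number of representations."""
    return sum(ways(k) * float(k) ** -j for k in range(1, k_max + 1))


if __name__ == "__main__":
    j, k_max = 3, 200
    # beta = 1: (s + r/2)^(-j) = 2^j (2s + r)^(-j); k = 2s + r has floor(k/2) representations
    print(2 ** j * lattice_sum(j, 2, 1, k_max),
          2 ** j * counted_sum(j, lambda k: k // 2, k_max))
    # beta = 2: k = s + r has k representations, so the full sum is zeta(j - 1)
    print(lattice_sum(j, 1, 1, k_max), counted_sum(j, lambda k: k, k_max))
```

Both pairs of numbers agree up to rounding, and for $\beta = 2$ the truncated sums stay below $\pi^2/6$ for every $j \ge 3$.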



The case $\beta = 4$ can be considered similarly, see [23, p. 84]: Counting the ways in which
$k = s + 2r$ this yields

n— 1 oo ^ ^ 1 \ ^ 

r=0 s=l V ' 7 \ / 

Together with equation (4.18) we can conclude that



riog(Z)x 



|r,5Jlog(Z)| ^ 2^-1-1 ^,.,^^ 1 



a: 



n,/3 



< r(j)- 



0": 



J El 



n,fi 1^ 6 



2i-iz^ for /3 = 1 

for /3 = 2, 4. 



In order to read off the parameters $\gamma$ and $\Delta$ we apply that the variance of Z is bounded from
below by a'^ ^ > > Finally we have 



2^\ for /3 = 1 
<(j!)^<( 4^ for/3 = 2 V < (jC 



(8|!)^-2 for/3 = l 



'^)^-' for/3 



8^ for /3 = 4 



0": 



i-2 



(^)'"' for/3 = 4 

(4.19) 



for all $j \ge 3$, hence equation (1.3) is satisfied for $\gamma = 0$ and $\Delta_n$ = ^g^/ . Theorem 1.1
completes the proof for $\Re\log(Z)$. Since the $j$-th cumulant of the distribution of the imaginary
part of $\log Z$ can be bounded by the $j$-th cumulant of the distribution of the real part of
$\log Z$ for all $j \ge 3$, see [23, eq. (62)], the MDP of $\Im\log(Z)$ follows immediately. □

Remark 4.3. Dyson observed that the induced eigenvalue distributions of the C$\beta$E ensembles
correspond to the Gibbs distribution for the classical Coulomb gas on the circle at
three different temperatures. Matrix models for general $\beta > 0$ for Dyson's circular eigenvalue
statistics are provided in [21], using the theory of orthogonal polynomials on the unit
circle. They obtained a sparse matrix model which is five-diagonal. In this framework,
there is no natural underlying measure such as the Haar measure; the matrix ensembles are
characterized by the laws of their elements.



5. Moderate deviations for determinantal point processes 

The collection of eigenvalues of a random matrix can be viewed as a configuration of points
(on $\mathbb{R}$ or on $\mathbb{C}$), that is, a determinantal process. Central Limit Theorems for occupation
numbers were studied in the literature, see [4] and references therein. This section is devoted
to the study of moderate deviation principles for occupation numbers of determinantal point
processes. We will see that it will be an application of Theorem 1.2.

Let $\Lambda$ be a locally compact Polish space, equipped with a positive Radon measure $\mu$ on
its Borel $\sigma$-algebra. Let $\mathcal{M}_+(\Lambda)$ denote the set of positive $\sigma$-finite Radon measures on $\Lambda$.
A point process is a random, integer-valued $\chi \in \mathcal{M}_+(\Lambda)$, and it is simple if $P(\exists x \in \Lambda :
\chi(\{x\}) > 1) = 0$. A locally integrable function $\varrho_k : \Lambda^k \to [0,\infty)$ is called a joint intensity
(correlation), if for any mutually disjoint family of subsets $D_1, \ldots, D_k$ of $\Lambda$

$$\mathbb{E}\Bigl(\prod_{i=1}^{k}\chi(D_i)\Bigr) = \int_{\prod_{i=1}^{k}D_i}\varrho_k(x_1,\ldots,x_k)\,d\mu(x_1)\cdots d\mu(x_k),$$

where $\mathbb{E}$ denotes the expectation with respect to the law of the point configurations of $\chi$. A
simple point process $\chi$ is said to be a determinantal point process with kernel $K$ if its joint
intensities $\varrho_k$ exist and are given by

$$\varrho_k(x_1,\ldots,x_k) = \det\bigl(K(x_i,x_j)\bigr)_{i,j=1}^{k}. \tag{5.20}$$

An integral operator $\mathcal{K} : L^2(\mu) \to L^2(\mu)$ with kernel $K$ given by

$$\mathcal{K}(f)(x) = \int K(x,y)f(y)\,d\mu(y), \qquad f \in L^2(\mu),$$

is admissible with admissible kernel $K$ if $\mathcal{K}$ is self-adjoint, nonnegative and locally trace-class
(for details see [H 4.2.12]). A standard result is that an integral compact operator $\mathcal{K}$ with
admissible kernel $K$ possesses the decomposition

$$\mathcal{K}f(x) = \sum_{k=1}^{n}\lambda_k\varphi_k(x)\langle\varphi_k, f\rangle_{L^2(\mu)},$$

where the functions $\varphi_k$ are orthonormal in $L^2(\mu)$, $n$ is either finite or infinite, and $\lambda_k > 0$ for
all $k$, leading to

$$K(x,y) = \sum_{k=1}^{n}\lambda_k\varphi_k(x)\varphi_k^{*}(y), \tag{5.21}$$

an equality in $L^2(\mu\times\mu)$. Moreover, an admissible integral operator $\mathcal{K}$ with kernel $K$ is
called good with good kernel $K$ if the $\lambda_k$ in (5.21) satisfy $\lambda_k \in (0,1]$. If the kernel $K$ of
a determinantal point process is (locally) admissible, then it must in fact be good, see [U
4.2.21].

Example 5.1. Let $(\lambda_1, \ldots, \lambda_n)$ be the eigenvalues of the GUE (Gaussian unitary ensemble)
of dimension $n$ and denote by $\chi_n$ the point process $\chi_n(D) = \sum_{i=1}^{n}1_{\{\lambda_i \in D\}}$. Then $\chi_n$ is
a determinantal point process with admissible, good kernel $K(x,y) = \sum_{k=0}^{n-1}\psi_k(x)\psi_k(y)$,
where the functions $\psi_k$ are the oscillator wave-functions, that is $\psi_k(x) := \frac{e^{-x^2/4}H_k(x)}{\sqrt{\sqrt{2\pi}\,k!}}$, where
$H_k(x) := (-1)^k e^{x^2/2}\frac{d^k}{dx^k}e^{-x^2/2}$ is the $k$-th Hermite polynomial; see [H Def. 3.2.1, Ex. 4.2.15].
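
To make Example 5.1 concrete, the following sketch evaluates the oscillator wave-functions and the kernel $K(x,y) = \sum_{k=0}^{n-1}\psi_k(x)\psi_k(y)$ numerically. It assumes the standard normalization $\psi_k(x) = e^{-x^2/4}H_k(x)/\sqrt{\sqrt{2\pi}\,k!}$ with probabilists' Hermite polynomials; the function names and the trapezoidal quadrature are illustrative choices. The checks are orthonormality of the $\psi_k$ and the trace identity $\int K(x,x)\,dx = n$, that is, the expected number of points of $\chi_n$.

```python
import math


def hermite(k, x):
    """Probabilists' Hermite polynomial H_k(x) via H_{m+1} = x H_m - m H_{m-1}."""
    if k == 0:
        return 1.0
    h_prev, h = 1.0, x
    for m in range(1, k):
        h_prev, h = h, x * h - m * h_prev
    return h


def psi(k, x):
    """Oscillator wave-function psi_k(x)."""
    norm = math.sqrt(math.sqrt(2.0 * math.pi) * math.factorial(k))
    return math.exp(-x * x / 4.0) * hermite(k, x) / norm


def gue_kernel(n, x, y):
    """Kernel K(x, y) = sum_{k < n} psi_k(x) psi_k(y)."""
    return sum(psi(k, x) * psi(k, y) for k in range(n))


def integrate(f, lo=-12.0, hi=12.0, steps=4800):
    """Composite trapezoidal rule; accurate here because the integrands decay rapidly."""
    h = (hi - lo) / steps
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, steps):
        total += f(lo + i * h)
    return total * h


if __name__ == "__main__":
    print(integrate(lambda x: psi(2, x) * psi(2, x)))   # close to 1
    print(integrate(lambda x: psi(1, x) * psi(3, x)))   # close to 0
    print(integrate(lambda x: gue_kernel(5, x, x)))     # close to 5
```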

We will apply the following representation due to Theorem 7]: Suppose $\chi$ is a determinantal
process with good kernel $K$ of the form (5.21), with $\sum_k\lambda_k < \infty$. Let $(I_k)_{k=1}^{n}$ be
independent Bernoulli variables with $P(I_k = 1) = \lambda_k$. Set

$$K_I(x,y) = \sum_{k=1}^{n}I_k\varphi_k(x)\varphi_k^{*}(y),$$

and let $\chi_I$ denote the determinantal point process with random kernel $K_I$. Then $\chi$ and $\chi_I$
have the same distribution. Therefore, let $K$ be a good kernel and for $D \subset \Lambda$ we write
$K_D(x,y) = 1_D(x)K(x,y)1_D(y)$. Let $D$ be such that $K_D$ is trace-class, with eigenvalues $\lambda_k$,
$k \ge 1$. Then $\chi(D)$ has the same distribution as $\sum_k\xi_k$ where $\xi_k$ are independent Bernoulli
random variables with $P(\xi_k = 1) = \lambda_k$ and $P(\xi_k = 0) = 1 - \lambda_k$. Now we can state the main
result of this section:

Theorem 5.2. Consider a sequence $(\chi_n)_n$ of determinantal point processes on $\Lambda$ with good
kernels $K_n$. Let $D_n$ be a sequence of measurable subsets of $\Lambda$ such that $(K_n)_{D_n}$ is trace class.
Assume that $(a_n)_n$ is a sequence of real numbers such that

'^^""^^maxi<,<„(Ar(l-Ar))V2- 

Then $(Z_n)_n$ with

$$Z_n := \frac{1}{a_n}\,\frac{\chi_n(D_n) - \mathbb{E}(\chi_n(D_n))}{\sqrt{\mathbb{V}(\chi_n(D_n))}}$$

satisfies a moderate deviation principle with speed $a_n^2$ and rate function $I(x) = \frac{x^2}{2}$.

Remark 5.3. Obviously we have $\max_{1\le i\le n}\bigl(\lambda_i^n(1-\lambda_i^n)\bigr)^{1/2} \le \frac12$. To assure that $(a_n)_n$ is
growing to infinity, it is necessary that $\mathbb{V}(\chi_n(D_n))$ goes to infinity. Moreover, under the
assumptions of Theorem 5.2,

$$\mathbb{V}(\chi_n(D_n)) = \sum_k\lambda_k^n(1-\lambda_k^n) \le \sum_k\lambda_k^n = \int_{D_n}K_n(x,x)\,d\mu_n(x),$$

thus for a moderate deviation principle, it is necessary that $\lim_{n\to\infty}\int_{D_n}K_n(x,x)\,d\mu_n(x) = +\infty$.

Proof of Theorem 5.2. We only have to check a moderate deviation principle for the rescaled
partial sums of independent Bernoulli random variables with $P(\xi_k = 1) = \lambda_k$. Therefore
we apply Theorem 11.21 Take XJ^ := , ^ =. Then we obtain easily that condition (11. 4p 

is satisfied for these random variables with $\gamma = 0$ and a constant $K_n = 1$. □
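
The reduction used in this proof, namely that $\chi(D)$ is distributed as a sum of independent Bernoulli variables with parameters $\lambda_k$, can be explored numerically. The following sketch is illustrative and not from the paper: the function names are hypothetical and the eigenvalues $\lambda_k$ are supplied by the user rather than computed from a kernel. It builds the exact distribution of $\sum_k\xi_k$ by convolution and evaluates $\frac{1}{a^2}\log P\bigl((S - \mathbb{E}S)/\sqrt{\mathbb{V}S} \ge a x\bigr)$, the quantity whose limit the moderate deviation principle describes.

```python
import math


def occupation_pmf(lams):
    """Exact distribution of S = sum of independent Bernoulli(lam_k) variables."""
    pmf = [1.0]
    for lam in lams:
        nxt = [0.0] * (len(pmf) + 1)
        for m, p in enumerate(pmf):
            nxt[m] += p * (1.0 - lam)     # xi_k = 0
            nxt[m + 1] += p * lam         # xi_k = 1
        pmf = nxt
    return pmf


def scaled_log_tail(lams, a, x):
    """(1 / a^2) * log P((S - E S) / sqrt(V S) >= a * x)."""
    mean = sum(lams)
    sd = math.sqrt(sum(lam * (1.0 - lam) for lam in lams))
    threshold = mean + a * x * sd
    tail = sum(p for m, p in enumerate(occupation_pmf(lams)) if m >= threshold)
    return math.log(tail) / a ** 2


if __name__ == "__main__":
    lams = [0.5] * 400                    # hypothetical eigenvalues in (0, 1]
    for a in (1.0, 2.0, 3.0):
        print(a, scaled_log_tail(lams, a, 1.0), -0.5)
```

For finite $n$ the printed values differ from the limiting value $-x^2/2$ because of the polynomial prefactor of the Gaussian tail; the principle only concerns the limit as $a_n \to \infty$ slowly enough.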

Example 5.4 (Eigenvalues of the GUE/GOE). Let $D = [-a,b]$ with $a, b > 0$ and $\alpha \in
(-\frac12, \frac12)$, and $D_n := n^{\alpha}D$. Consider the determinantal point process of Example 5.1. Then
$Z_n/a_n$ satisfies a moderate deviation principle; see [H 4.2.27], where $\mathbb{V}(\chi_n(D_n)) \to \infty$ is
proved applying an upper bound with the help of the sine-kernel. Note that the same
conclusions hold when the GUE is replaced by the GOE (Gaussian orthogonal ensemble),
see [d 4.2.29].

Example 5.5 (Sine-, Airy- and Bessel point processes). Recall the sine-kernel $K_{\mathrm{sine}}(x,y) =
\frac{1}{\pi}\frac{\sin(x-y)}{x-y}$, which arises as the limit of many interesting point processes, for example as a
scaling limit in the bulk of the spectrum in the GUE. With $\Lambda = \mathbb{R}$ and $\mu$ the Lebesgue
measure, the corresponding operator is locally admissible and determines a determinantal
point process on $\mathbb{R}$. The operator is not of trace class but locally of trace class. For $D_n =
[-n,n]$, consider $K_n = 1_{D_n}K_{\mathrm{sine}}$. The Central Limit Theorem for the rescaled $\chi_n(D_n)$ was
proved by Costin and Lebowitz in 1995. They proved that $\mathbb{V}(\chi_n(D_n))$ goes to infinity. Hence
a moderate deviation principle for the appropriately rescaled sine kernel process follows. It
was shown in [32] that the condition $\lim_{n\to\infty}\mathbb{V}(\chi_n(D_n)) = +\infty$ is satisfied for the Airy
kernel $K_{\mathrm{Airy}}$ with $D_n = [-n,n]$, and for the Bessel kernel $K_{\mathrm{Bessel}}$ with $D_n = [-n,n]$. In these
cases, the growth of $\mathbb{V}(\chi_n(D_n))$ is logarithmic with respect to the mean number of points
in $D_n$. For a proof that the Airy process has a locally admissible kernel which determines
a determinantal point process, see [U 4.2.30]. The Airy kernel arises as a scaling limit at
the edge of the spectrum in the GUE and at the soft right edge of the spectrum in the
Laguerre ensemble, while the Bessel kernel arises as a scaling limit at the hard left edge in
the Laguerre ensemble. We conclude a moderate deviation principle for the corresponding
kernel point processes. For details and more examples like families of kernels corresponding
to random matrices for the classical compact groups, see [31].

6. Proof of Theorem 1.1

The following lemma is an essential element of the proof of Theorem 1.1. Rudzkis, Saulis
and Statulevicius showed in 1978 that condition (1.3) on the cumulants implies the following
large deviation probabilities:
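Lemma 6.1. Let $Z$ be a centered random variable with variance one and existing absolute
moments, which satisfies

$$|\Gamma_j| \le \frac{(j!)^{1+\gamma}}{\Delta^{j-2}} \qquad \text{for all } j = 3, 4, \ldots$$

for fixed $\gamma \ge 0$ and $\Delta > 0$. Then

$$\frac{P(Z \ge x)}{1-\Phi(x)} = \exp(L_\gamma(x))\Bigl(1 + q_1\psi(x)\frac{x+1}{\Delta_\gamma}\Bigr)$$

and

$$\frac{P(Z \le -x)}{\Phi(-x)} = \exp(L_\gamma(-x))\Bigl(1 + q_2\psi(x)\frac{x+1}{\Delta_\gamma}\Bigr)$$

hold in the interval $0 \le x < \Delta_\gamma$, using the following notation:

$$\Delta_\gamma = \frac16\Bigl(\frac{\sqrt2}{6}\Delta\Bigr)^{1/(1+2\gamma)}, \qquad \psi(x) = \frac{60\bigl(1 + 10\Delta_\gamma^2\exp\bigl(-(1 - x/\Delta_\gamma)\sqrt{\Delta_\gamma}\bigr)\bigr)}{1 - x/\Delta_\gamma}, \tag{6.22}$$

$q_1, q_2$ are two constants in the interval $[-1,1]$ and $L_\gamma$ is a function (defined in [30, Lemma
2.3, eq. (2.8)]) satisfying

$$|L_\gamma(x)| \le \frac{|x|^3}{3\Delta_\gamma} \qquad \text{for all } x \text{ with } |x| \le \Delta_\gamma. \tag{6.23}$$

For the proof see [30, Lemma 2.3].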

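Lemma 6.2. In the situation of Lemma 6.1 there exist two constants $C_1(\gamma)$ and $C_2(\gamma)$, which
depend only on $\gamma$ and satisfy the following inequalities:

$$\Bigl|\log\frac{P(Z \ge x)}{1-\Phi(x)}\Bigr| \le C_2(\gamma)\frac{1+x^3}{\Delta^{1/(1+2\gamma)}} \qquad\text{and}\qquad \Bigl|\log\frac{P(Z \le -x)}{\Phi(-x)}\Bigr| \le C_2(\gamma)\frac{1+x^3}{\Delta^{1/(1+2\gamma)}}$$

for all $0 \le x \le C_1(\gamma)\Delta^{1/(1+2\gamma)}$.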



Proof. In [13] these bounds were concluded from the previous Lemma 6.1. The proof here is
analogous to the proof of [13, Corollary 3.1]. In the situation of Lemma 6.1 the function $\psi$
defined in (6.22) is bounded by $\psi(x) \le c_1 + c_2\Delta_\gamma^2\exp(-c_3\sqrt{\Delta_\gamma})$ for all $0 \le x \le q\Delta_\gamma$ for any
fixed constant $q \in [0,1)$ and some positive constants $c_1$, $c_2$ and $c_3$ depending on $q$ only. The
term $c_1 + c_2\Delta_\gamma^2\exp(-c_3\sqrt{\Delta_\gamma})$ can be bounded uniformly in $\Delta_\gamma = \frac16\bigl(\frac{\sqrt2}{6}\Delta\bigr)^{1/(1+2\gamma)}$, which
combined with the estimation (6.23) implies the existence of universal positive constants
$c_4$, $c_5$ and $c_6$, such that

$$\exp\Bigl(-\frac{c_5x^3}{\Delta^{1/(1+2\gamma)}}\Bigr)\Bigl(1 - \frac{c_6(1+x)}{\Delta^{1/(1+2\gamma)}}\Bigr) \le \frac{P(Z \ge x)}{1-\Phi(x)} \le \exp\Bigl(\frac{c_5x^3}{\Delta^{1/(1+2\gamma)}}\Bigr)\Bigl(1 + \frac{c_6(1+x)}{\Delta^{1/(1+2\gamma)}}\Bigr)$$

holds for all $0 \le x \le c_4\Delta^{1/(1+2\gamma)}$. If $\Delta^{1/(1+2\gamma)} \le 3c_6$, we can choose $C_1(\gamma)$ and $C_2(\gamma)$ such
that the first inequality in Lemma 6.2 is satisfied. In the case

^1/(1+27) > we have for 



all < X < 



3c6 



C6(l + X) 



AV(l+27) 



If Ai/(i+27) > and < a; < 



log 



P{Z < -x) 



< 



Al/(l+27) 



AV(i+2t) 
3c6 

+ max 



C6 1 2 

< h - < - 

- AV(i+27) 3 - 3 ■ 

hold, we can bound 

/ Cq{1+x)\ 



log 1 + 



Cq{1 + x) 

AV(l+27) 



Due to the concavity of the logarithm the absolute value of the straight line

$$\frac{3\log 3}{2}\,x - \frac{3\log 3}{2}$$

is bigger than or equal to the absolute value of $\log(x)$ for any $\frac13 \le x \le 1$. And we have

$$|\log(1-y)| \le \frac{3\log 3}{2}\,y$$

and $|\log(1+y)| \le y$ for any $0 \le y \le \frac23$. Thus for $\Delta^{1/(1+2\gamma)} > 3c_6$ and $0 \le x \le$
follows that 

P{Z < -x) 



^1/(1+27) . 



3c6 



it 



log 



< 



C5X 



Al/(l+27) 



^ 31og3c6(l +x) ^ 



C5X 



^1/(1+27) - ^1/(1+27) 



+ 



log 3 C6(5 + x^) 

~^ ^1/(1+27) 



applying $x^3 - 3x + 2 = (x-1)^2(x+2) \ge 0$, which is equivalent to $3(1+x) \le 5 + x^3$. Thus the
first inequality in Lemma 6.2 is proved. The second inequality in Lemma 6.2 can be proved
similarly. □
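
The elementary inequalities used in this proof are easy to check on a grid. The sketch below is only a numerical sanity check under the reading $|\log(1-y)| \le \frac{3\log 3}{2}y$ and $|\log(1+y)| \le y$ for $0 \le y \le \frac23$, together with $3(1+x) \le 5 + x^3$ for $x \ge 0$; the function name is hypothetical.

```python
import math

SLOPE = 1.5 * math.log(3.0)   # slope of the chord of log through (1/3, -log 3) and (1, 0)


def check_elementary_bounds(steps=2000, x_max=50.0, eps=1e-12):
    """Return True if all three inequalities hold on a uniform grid (up to rounding)."""
    for i in range(steps + 1):
        y = (2.0 / 3.0) * i / steps              # y in [0, 2/3]
        if abs(math.log(1.0 - y)) > SLOPE * y + eps:
            return False
        if abs(math.log(1.0 + y)) > y + eps:
            return False
        x = x_max * i / steps                    # x in [0, x_max]
        if 3.0 * (1.0 + x) > 5.0 + x ** 3 + eps:
            return False
    return True


if __name__ == "__main__":
    print(check_elementary_bounds())
```

Equality holds at $y = \frac23$ in the first inequality and at $x = 1$ in the last one, which is why a small tolerance is used.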



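Proof of Theorem 1.1. The idea of the proof is similar to the proof of [13, Lemma 3.6] for
the case of bounded geometric functionals. It follows from Lemma 6.2 that in the situation
of Theorem 1.1 there exist two constants $C_1(\gamma)$ and $C_2(\gamma)$, which satisfy the following
inequalities:

$$\Bigl|\log\frac{P(Z_n \ge y)}{1-\Phi(y)}\Bigr| \le C_2(\gamma)\frac{1+y^3}{\Delta_n^{1/(1+2\gamma)}} \qquad\text{and}\qquad \Bigl|\log\frac{P(Z_n \le -y)}{\Phi(-y)}\Bigr| \le C_2(\gamma)\frac{1+y^3}{\Delta_n^{1/(1+2\gamma)}}$$

for all $0 \le y \le C_1(\gamma)\Delta_n^{1/(1+2\gamma)}$. The logarithm can be represented as

$$\log\frac{P\bigl(\frac{1}{a_n}Z_n \ge x\bigr)}{1-\Phi(a_nx)} = \log\frac{P\bigl(\frac{1}{a_n}Z_n \ge x\bigr)}{e^{-\frac{a_n^2x^2}{2}}\,e^{\frac{a_n^2x^2}{2}}\bigl(1-\Phi(a_nx)\bigr)} = \log P\Bigl(\frac{1}{a_n}Z_n \ge x\Bigr) + \frac{(a_nx)^2}{2} - \log\Bigl(e^{\frac{a_n^2x^2}{2}}\bigl(1-\Phi(a_nx)\bigr)\Bigr).$$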
log(^e'''"2"' (1 - $(a„x)) 
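For the term at the left-hand side we can use the bounds provided by Lemma 6.2 for $y = a_nx$
and $0 \le x \le C_1(\gamma)\frac{\Delta_n^{1/(1+2\gamma)}}{a_n}$. Note that the bound for $x$ grows to infinity as $n$ does, thus it
does not imply any restriction. Since, for all $y \ge 0$, we have

$$\frac{1}{2+\sqrt{2\pi}\,y} \le e^{y^2/2}\bigl(1-\Phi(y)\bigr) \le \frac12,$$

the monotonicity of the logarithm implies

$$\begin{aligned}
\Bigl|\log P\Bigl(\frac{1}{a_n}Z_n \ge x\Bigr) + \frac{(a_nx)^2}{2}\Bigr| &\le \Bigl|\log\Bigl(e^{\frac{a_n^2x^2}{2}}\bigl(1-\Phi(a_nx)\bigr)\Bigr)\Bigr| + \Bigl|\log\frac{P\bigl(\frac{1}{a_n}Z_n \ge x\bigr)}{1-\Phi(a_nx)}\Bigr|\\
&\le \Bigl|\log\frac{1}{2+\sqrt{2\pi}\,a_nx}\Bigr| + C_2(\gamma)\frac{1+(a_nx)^3}{\Delta_n^{1/(1+2\gamma)}}\\
&\le \log\bigl(2+\sqrt{2\pi}\,a_nx\bigr) + C_2(\gamma)\frac{1+(a_nx)^3}{\Delta_n^{1/(1+2\gamma)}}.
\end{aligned}$$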



And it follows that

$$\Bigl|\frac{1}{a_n^2}\log P\Bigl(\frac{1}{a_n}Z_n \ge x\Bigr) + \frac{x^2}{2}\Bigr| \le \frac{1}{a_n^2}\log\bigl(2+\sqrt{2\pi}\,a_nx\bigr) + C_2(\gamma)\frac{1+(a_nx)^3}{a_n^2\,\Delta_n^{1/(1+2\gamma)}} \xrightarrow[n\to\infty]{} 0.$$

Similarly we can prove

$$\Bigl|\frac{1}{a_n^2}\log P\Bigl(\frac{1}{a_n}Z_n \le -x\Bigr) + \frac{x^2}{2}\Bigr| \le \frac{1}{a_n^2}\log\bigl(2+\sqrt{2\pi}\,a_nx\bigr) + C_2(\gamma)\frac{1+(a_nx)^3}{a_n^2\,\Delta_n^{1/(1+2\gamma)}} \xrightarrow[n\to\infty]{} 0.$$




These bounds can be carried forward to a full moderate deviation principle analogously to the
proof of [m Theorem 1.2]. □
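
The proof of Theorem 1.1 relies on a two-sided bound for the Gaussian tail of the form $\frac{1}{2+\sqrt{2\pi}\,y} \le e^{y^2/2}(1-\Phi(y)) \le \frac12$ for $y \ge 0$. The following sketch checks a bound of this form numerically and shows that $\frac{1}{a^2}\log\bigl(2+\sqrt{2\pi}\,ax\bigr)$ becomes negligible as $a$ grows. It is a sanity check under that reading of the inequality, with hypothetical function names, and uses `math.erfc` for $1-\Phi$.

```python
import math


def scaled_gaussian_tail(y):
    """e^{y^2/2} * (1 - Phi(y)), with 1 - Phi(y) = erfc(y / sqrt(2)) / 2."""
    return math.exp(0.5 * y * y) * 0.5 * math.erfc(y / math.sqrt(2.0))


def tail_bounds_hold(y_max=20.0, steps=2000, eps=1e-12):
    """Check 1 / (2 + sqrt(2 pi) y) <= e^{y^2/2} (1 - Phi(y)) <= 1/2 on a grid."""
    for i in range(steps + 1):
        y = y_max * i / steps
        value = scaled_gaussian_tail(y)
        lower = 1.0 / (2.0 + math.sqrt(2.0 * math.pi) * y)
        if not (lower - eps <= value <= 0.5 + eps):
            return False
    return True


def log_correction(a, x):
    """(1 / a^2) * log(2 + sqrt(2 pi) * a * x), the vanishing logarithmic term."""
    return math.log(2.0 + math.sqrt(2.0 * math.pi) * a * x) / a ** 2


if __name__ == "__main__":
    print(tail_bounds_hold())
    for a in (1.0, 10.0, 100.0):
        print(a, log_correction(a, 1.0))
```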